11  Understanding Python

11.1 Why Python for Business Analytics?

Python is one of the most widely used programming languages in business analytics and data science due to its simplicity, flexibility, and extensive ecosystem of libraries. In this book, we will use Python to implement various statistical techniques, data analysis methods, and visualization tools to support business decision-making.

Key Advantages of Python in Business Analytics

  • Easy to Learn and Use → Python has a simple syntax that makes it easy for business analysts to perform statistical analysis and automation.
  • Rich Ecosystem → Python has powerful libraries like Pandas, NumPy, SciPy, Statsmodels, and Scikit-learn, which provide built-in functions for statistical analysis.
  • Scalability and Efficiency → Python can handle large datasets efficiently and integrates well with databases, cloud computing, and machine learning models.
  • Extensive Visualization Support → Libraries like Matplotlib and Seaborn make it easy to create meaningful visualizations.
  • Automation and Integration → Python can automate repetitive tasks, streamline workflows, and integrate with tools like Excel, SQL databases, and web applications.

11.1.1 Differences Between Python and R

Python and R are two of the most widely used programming languages in business analytics, data science, and statistical computing. Both languages have their strengths, but they cater to slightly different needs. Below is a comparison of Python vs. R in various aspects.

Purpose and Usage

| Aspect | Python | R |
|---|---|---|
| Primary Use | General-purpose programming, data science, automation, web development | Statistical computing, data visualization, academic research |
| Best For | Machine Learning, Automation, Data Science | Statistical Analysis, Data Visualization, Research |

Ease of Learning

| Aspect | Python | R |
|---|---|---|
| Syntax | Simple, readable, similar to English | More complex syntax, designed for statisticians |
| Learning Curve | Easier for beginners, widely used in software development | Steeper learning curve, but powerful for statistical analysis |

Libraries and Packages

| Aspect | Python | R |
|---|---|---|
| Data Manipulation | Pandas, NumPy | dplyr, data.table |
| Statistical Analysis | Statsmodels, SciPy | Base R, car, lme4 |
| Machine Learning | Scikit-learn, TensorFlow, PyTorch | caret, randomForest, xgboost |
| Data Visualization | Matplotlib, Seaborn, Plotly | ggplot2, lattice, plotly |

Data Handling and Performance

| Aspect | Python | R |
|---|---|---|
| Data Handling | Handles structured and unstructured data well | Primarily designed for structured data |
| Big Data Support | Integrates with Spark, Dask for large datasets | Not optimized for big data but integrates with Hadoop and Spark |
| Speed & Efficiency | Generally faster for ML and large datasets | Slower for big data but optimized for statistical tasks |

Business & Industry Use Cases

| Aspect | Python | R |
|---|---|---|
| Used In | Finance, AI, Web Development, Automation, ML | Academic Research, Healthcare, Pharma, Government |
| Common Applications | AI-driven analytics, automated reporting, cloud computing | Statistical modeling, survey analysis, experimental research |

Community and Industry Support

| Aspect | Python | R |
|---|---|---|
| Community | Large, growing community in AI & ML | Strong academic and research community |
| Industry Adoption | Used by companies like Google, Netflix, Tesla | Preferred by universities, research institutions |

Integration and Flexibility

| Aspect | Python | R |
|---|---|---|
| Integration | Works well with APIs, web apps, databases | Strong integration with statistical packages |
| Flexibility | More versatile, can be used in different fields | Specialized for data analysis and statistics |

📌 Which One Should You Use?

  • Use Python if: You need machine learning, automation, web development, or large-scale data processing.
  • Use R if: You need advanced statistical analysis, data visualization, or academic research tools.
  • Use Both if: Your work involves both statistical analysis and machine learning.

11.2 Installing Python and Anaconda Navigator

11.2.1 Installing Python

Download Python

  • Visit the official Python website (python.org).
  • Go to the Downloads section. The website typically suggests the best version for your operating system.
  • Click on the download link for your operating system (Windows, macOS, Linux/UNIX).

Install Python

  • After downloading, run the installer.
  • For Windows: Ensure you check the box that says “Add Python to PATH” before you click “Install Now”.
  • Follow the prompts in the Python Install Wizard.

Verify Installation

  • Open your command line (Command Prompt on Windows, Terminal on macOS and Linux).
  • Type python --version and press Enter. This should display the version of Python that you just installed. On macOS and Linux, the command may be python3 --version instead.
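
The same check can be made from inside Python, which also shows which interpreter is actually running when several are installed. The sketch below uses only the standard library; the helper name describe_python and the minimum version (3, 8) are assumptions chosen for illustration, not requirements stated in this book.

```python
import sys


def describe_python(minimum=(3, 8)):
    """Return (version_text, meets_minimum) for the running interpreter.

    `minimum` is a hypothetical threshold used only for illustration.
    """
    current = sys.version_info[:3]  # e.g. (3, 11, 4)
    version_text = ".".join(str(part) for part in current)
    meets_minimum = current >= tuple(minimum)
    return version_text, meets_minimum


version_text, meets_minimum = describe_python()
print(f"Python {version_text} at {sys.executable}")
print("Meets assumed minimum:", meets_minimum)
```

If the path printed for sys.executable is not the installation you expected, the PATH setting from the install step is the first thing to check.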

11.2.2 Installing Anaconda Navigator

Download Anaconda

  • Visit the Anaconda download page
  • Scroll down to the Anaconda Installers section.
  • Download the appropriate version for your operating system.

Install Anaconda

  • Run the downloaded installer.
  • Follow the prompts in the Anaconda Install Wizard. Accept the default settings unless you have specific preferences.
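  • On Windows, the installer offers an option to add Anaconda to your PATH environment variable. The installer itself marks this option as not recommended because it can interfere with other software; leaving it unchecked and launching Anaconda from the Start menu or Anaconda Prompt is the safer choice.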

Verify Installation

  • Open Anaconda Navigator
  • For Windows: Search for Anaconda Navigator in the Start menu.
  • For macOS/Linux: Use the terminal or search in your applications folder.
  • If Anaconda Navigator opens successfully, the installation is complete.
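
To confirm that a notebook or script is running on the Anaconda-provided interpreter rather than a separate Python installation, a rough check like the one below can help. It is a heuristic sketch, not an official Anaconda API: it assumes that conda-managed environments contain a conda-meta folder and that conda activate sets the CONDA_DEFAULT_ENV environment variable. The helper name conda_info is hypothetical.

```python
import os
import sys


def conda_info():
    """Collect hints about whether this interpreter belongs to a conda install."""
    return {
        # Conda-managed environments usually contain a 'conda-meta' folder.
        "has_conda_meta": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
        # Set when a conda environment is activated; None otherwise.
        "active_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Root folder of the running Python installation.
        "prefix": sys.prefix,
    }


info = conda_info()
for key, value in info.items():
    print(f"{key}: {value}")
```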

11.2.3 Import the following packages

  • NumPy – A Python library for numerical computation. It provides the multidimensional ndarray and a large collection of mathematical functions that operate on these arrays.
  • Pandas – A Python library built on top of NumPy for working with tabular data in DataFrames. It is used for data cleaning, merging, reshaping, and aggregation.
  • Matplotlib – A plotting library for 2D and 3D visualizations. It can display figures interactively or save them in a variety of output formats.

Sample Python code:
print('Hello, Python!')

Output:
Hello, Python!
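
As a minimal sketch of what importing and using NumPy and pandas looks like, the example below computes a mean and growth rates with NumPy and then places the same numbers in a DataFrame. The sales figures are invented for illustration, and the aliases np and pd are the usual community conventions rather than requirements.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly sales figures, invented for illustration.
sales = np.array([120.0, 135.5, 128.0, 150.25])

# NumPy: vectorized math on an ndarray.
mean_sales = sales.mean()
growth = np.diff(sales) / sales[:-1]  # month-over-month growth rates

# pandas: labeled, tabular data built on top of NumPy arrays.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "sales": sales,
})
df["above_average"] = df["sales"] > mean_sales

print(f"Mean sales: {mean_sales:.2f}")
print(df)
```

The boolean above_average column is computed in one vectorized comparison, with no explicit loop over rows.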

11.2.4 Install pandas and matplotlib packages

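Run the following commands in a Jupyter notebook cell. The leading ! passes the command to the system shell; when typing in a terminal, omit it. Anaconda already includes both packages, so this step is mainly needed for a standalone Python installation.

!pip3 install pandas

!pip3 install matplotlib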
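
After installing, it can be useful to confirm that the packages are visible to the interpreter you are using. The sketch below relies only on the standard library (importlib.util.find_spec and importlib.metadata.version); the helper name package_status is hypothetical. It assumes the import name and the installed distribution name are the same, which holds for numpy, pandas, and matplotlib but not for every package.

```python
import importlib.util
from importlib import metadata


def package_status(names):
    """Map each package name to its installed version, or None if missing."""
    status = {}
    for name in names:
        # find_spec returns None when the top-level module cannot be imported.
        if importlib.util.find_spec(name) is None:
            status[name] = None
            continue
        try:
            status[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this name.
            status[name] = "unknown"
    return status


for name, version in package_status(["numpy", "pandas", "matplotlib"]).items():
    print(f"{name}: {version if version else 'not installed'}")
```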

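Sample plot using Python:
import numpy as np
import matplotlib.pyplot as plt

# Radius values from 0 up to (but not including) 2, in steps of 0.01
r = np.arange(0, 2, 0.01)
# Angle grows linearly with radius, which traces a spiral with two full turns
theta = 2 * np.pi * r

# Create a figure whose axes use polar coordinates
fig, ax = plt.subplots(subplot_kw={'projection': 'polar'})
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])  # radial tick positions
ax.grid(True)
plt.show()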
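
For a plot closer to everyday business reporting, pandas and Matplotlib can be combined. The following is a minimal sketch with invented revenue numbers; the file name quarterly_revenue.png and the temporary-folder location are assumptions for illustration. It selects the non-interactive Agg backend so the script also runs where no display is available, and saves the figure to a file instead of showing it.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical quarterly revenue, invented for illustration.
df = pd.DataFrame({
    "quarter": ["Q1", "Q2", "Q3", "Q4"],
    "revenue": [250, 310, 290, 360],
})

fig, ax = plt.subplots()
ax.bar(df["quarter"], df["revenue"])
ax.set_xlabel("Quarter")
ax.set_ylabel("Revenue (thousands)")
ax.set_title("Quarterly revenue (hypothetical data)")

# Hypothetical output location: the system's temporary folder.
output_path = os.path.join(tempfile.gettempdir(), "quarterly_revenue.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)
print("Saved plot to", output_path)
```

In a notebook, the matplotlib.use("Agg") line can be dropped and plt.show() used in place of fig.savefig to display the chart inline.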